Where and What

Abstract

Human drivers use their attentional mechanisms to focus on critical objects and make decisions while driving. As human attention can be revealed from gaze data, capturing and analyzing gaze information has emerged in recent years to benefit autonomous driving technology. Previous works in this context have primarily aimed at predicting "where" human drivers look and lack knowledge of "what" objects drivers focus on. Our work bridges the gap between pixel-level and object-level attention prediction. Specifically, we propose to integrate an attention prediction module into a pretrained object detection framework and predict the attention in a grid-based style. Furthermore, critical objects are recognized based on predicted attended-to areas. We evaluate our proposed method on two driver attention datasets, BDD-A and DR(eye)VE. Our framework achieves competitive state-of-the-art performance in attention prediction on both the pixel level and the object level but is far more efficient (75.3 GFLOPs less) in computation.
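
The step from a grid-based attention prediction to attended objects can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the attention module yields a coarse grid of scores in [0, 1] and the detector yields pixel-space bounding boxes. The function name `attended_objects`, the default threshold, and the "maximum score over overlapping cells" rule are all hypothetical choices made for illustration.

```python
from typing import List, Sequence, Tuple

# (x_min, y_min, x_max, y_max) in pixels; a hypothetical box format.
Box = Tuple[float, float, float, float]


def _cell_range(lo: float, hi: float, cell: float, n: int) -> Tuple[int, int]:
    """Map a pixel interval to an inclusive range of grid indices, clamped to the grid."""
    first = min(n - 1, max(0, int(lo // cell)))
    last = min(n - 1, max(0, int(hi // cell)))
    return first, max(first, last)


def attended_objects(
    grid: Sequence[Sequence[float]],
    boxes: Sequence[Box],
    image_size: Tuple[float, float],
    threshold: float = 0.5,  # assumed value, not taken from the paper
) -> List[int]:
    """Return indices of boxes whose overlapping grid cells reach `threshold`."""
    width, height = image_size
    rows, cols = len(grid), len(grid[0])
    cell_w, cell_h = width / cols, height / rows

    selected = []
    for index, (x0, y0, x1, y1) in enumerate(boxes):
        c0, c1 = _cell_range(x0, x1, cell_w, cols)
        r0, r1 = _cell_range(y0, y1, cell_h, rows)
        # Highest attention score among the cells the box touches.
        peak = max(grid[r][c] for r in range(r0, r1 + 1) for c in range(c0, c1 + 1))
        if peak >= threshold:
            selected.append(index)
    return selected


# Toy 2x2 attention grid over a 100x100 image, with three detected boxes.
grid = [[0.1, 0.9],
        [0.2, 0.3]]
boxes = [(10, 10, 40, 40), (60, 10, 90, 40), (10, 60, 40, 90)]
print(attended_objects(grid, boxes, image_size=(100, 100)))
```

In this toy input only the second box lies over the high-attention cell, so it is the only one selected at the default threshold. Lowering the threshold admits more boxes.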

Similar resources

Where do we stand and what comes next?

Baryons: What, When and Where?

We review the current state of empirical knowledge of the total budget of baryonic matter in the Universe as observed since the epoch of reionization. Our summary focuses on three milestone redshifts since the reionization of H in the IGM, z = 3, 1, and 0, with emphasis on the endpoints. We review the observational techniques used to discover and characterize the phases of baryons. In the spir...

What Went Where

We present a novel framework for motion segmentation that combines the concepts of layer-based methods and feature-based motion estimation. We estimate the initial correspondences by comparing vectors of filter outputs at interest points, from which we compute candidate scene relations via random sampling of minimal subsets of correspondences. We achieve a dense, piecewise smooth assignment of p...

Where-What Network 1: “Where” and “What” Assist Each Other Through Top-down Connections

This paper describes the design of a single learning network that integrates both object location (“where”) and object type (“what”), from images of learned objects in natural complex backgrounds. The in-place learning algorithm is used to develop the internal representation (including synaptic bottom-up and top-down weights of every neuron) in the network, such that every neuron is responsible ...

Stacked What-Where Auto-encoders

We present a novel architecture, the “stacked what-where auto-encoders” (SWWAE), which integrates discriminative and generative pathways and provides a unified approach to supervised, semi-supervised and unsupervised learning without relying on sampling during training. An instantiation of SWWAE uses a convolutional net (Convnet) (LeCun et al. (1998)) to encode the input, and employs a deconvol...

Journal

Journal title: Proceedings of the ACM on Human-Computer Interaction

Year: 2022

ISSN: 2573-0142

DOI: https://doi.org/10.1145/3530887